AITopics | convolutional layer

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, retinal response, (18 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.72)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hierarchical Spatio-Channel Clustering for Efficient Model Compression in Medical Image Analysis

Hamlomo, Sisipho, Atemkeng, Marcellin, Likassa, Habte Tadesse, Ravelo, Blaise, Bouwmans, Thierry, Lalléchère, Sébastien, Vacavant, Antoine, Chen, Ding-Geng

arXiv.org Machine LearningApr-28-2026

Convolutional neural networks (CNNs) have become increasingly difficult to deploy in resource-constrained environments due to their large memory and computational requirements. Although low-rank compression methods can reduce this burden, most existing approaches compress spatial and channel redundancy independently and therefore do not fully exploit the localised structure within convolutional feature maps. This paper proposes a hierarchical spatio-channel low-rank compression framework for CNNs that exploits redundancy across spatial regions and channel activations. Unlike conventional methods, which apply a uniform decomposition across an entire layer, the proposed approach first partitions feature maps into spatial regions, then groups channels according to their co-activation patterns within each region, and finally applies rank-adaptive SVD to each resulting spatio-channel cluster. The method is evaluated on an AlexNet-based brain tumour MRI classification model and compared with Global SVD and Tucker decomposition under \(3\times\) and \(6\times\) compression budgets. Our method outperforms both baselines, reducing FLOPs from \(8.21\,\mathrm{G}\) to \(1.55\,\mathrm{G}\) (\(81.1\%\) reduction), achieving a \(1.38\times\) inference speed-up, and increasing classification accuracy from \(87.76\%\) to \(89.80\%\). The method also improves the macro \(F_1\)-score and performance on challenging classes such as meningioma. A hyper-parameter trade-off analysis demonstrates that the framework provides Pareto-optimal configurations, enabling control over the balance between compression and predictive performance. Moderate clustering with adaptive rank selection yields strong results. Bootstrap standard errors are reported for all classification metrics.

artificial intelligence, decomposition, machine learning, (19 more...)

arXiv.org Machine Learning

2604.23375

Country:

North America > United States (0.93)
Africa (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

51f15efdd170e6043fa02a74882f0470-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 22:00:30 GMT

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)

Add feedback

4c4c937b67cc8d785cea1e42ccea185c-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 19:14:41 GMT

Proof of Proposition 1. Due to Jensen's inequality and the fact that, by assumption, the distribution of human predictions P(h|x) is not a point-mass, it holds that Eh[`(h(x),y) |x] > `(µh(x),y). Proof of Theorem 3. We first provide the proof of the unconstrained case. Note that the above problem is a linear program and it decouples with respect to x. Therefore, for each x, the optimal solution is clearly given by: π m(d= 1 |x) = 1 if Ey|x[`(m(x),y) Eh|x[`(h,y)]] >0 0 otherwise Next, we provide the proof of the constrained case. To this aim, we consider the dual formulation of the optimization problem, where we only introduce a Lagrangian multiplier τP,b for the first constraint, i.e., maximize Ex π(x) Ey,h|x[`(h,y)] Ey|x[`(m(x),y)] + Ex [τP,b(π(x) b)] (13) subject to 0 π(x) 1 x X. (14) 13 The inner minimization problem can be solved using the similar argument for the unconstrained case.

artificial intelligence, machine learning, test sample, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)

Add feedback

Multi-layer State Evolution Under Random Convolutional Design

Neural Information Processing SystemsApr-25-2026, 07:44:18 GMT

Signal recovery under generative neural network priors has emerged as a promising direction in statistical inference and computational imaging. Theoretical analysis of reconstruction algorithms under generative priors is, however, challenging. For generative priors with fully connected layers and Gaussian i.i.d.

artificial intelligence, machine learning, matrix, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States (0.46)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

16c5b4102a6b6eb061e502ce6736ad8a-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 07:25:15 GMT

artificial intelligence, machine learning, statistics, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

VanillaNet: the Power of Minimalism in Deep Learning (Supplementary Material)

Neural Information Processing SystemsApr-25-2026, 06:34:10 GMT

The detailed architecture for VanillaNet with 7-13 layers can be found in Table 1, where each convolutional layer is followed with an activation function. For the VanillaNet-13-1.5, the number of channels are multiplied with 1.5. For classification on ImageNet, we train the VanillaNets for 300 epochs utilizing the cosine learning rate decay [5]. The λis linearly decayed from 1 to 0 on epoch 0 and 100, respectively. The training details can be fould in Table 2.

artificial intelligence, machine learning, vanillanet, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)

Add feedback

VanillaNet: the Power of Minimalism in Deep Learning

Neural Information Processing SystemsApr-25-2026, 06:34:06 GMT

At the heart of foundation models is the philosophy of "more is different", exemplified by the astonishing success in computer vision and natural language processing. However, the challenges of optimization and inherent complexity of transformer models call for a paradigm shift towards simplicity. In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design. By avoiding high depth, shortcuts, and intricate operations like selfattention, VanillaNet is refreshingly concise yet remarkably powerful. Each layer is carefully crafted to be compact and straightforward, with nonlinear activation functions pruned after training to restore the original architecture. VanillaNet overcomes the challenges of inherent complexity, making it ideal for resourceconstrained environments. Its easy-to-understand and highly simplified architecture opens new possibilities for efficient deployment. Extensive experimentation demonstrates that VanillaNet delivers performance on par with renowned deep neural networks and vision transformers, showcasing the power of minimalism in deep learning. This visionary journey of VanillaNet has significant potential to redefine the landscape and challenge the status quo of foundation model, setting a new path for elegant and effective model design.

artificial intelligence, deep learning, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

291d43c696d8c3704cdbe0a72ade5f6c-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 05:12:20 GMT

artificial intelligence, machine learning, segmentation, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science (0.93)
(2 more...)

Add feedback

Appendix

Neural Information Processing SystemsApr-25-2026, 03:48:18 GMT

In this appendix, we first introduce the datasets and evaluation metrics used in the experiments in Section A. Then, we provide extra experimental results in Section B. In Section C, we present details of network design, training scheme, and hyper-parameter tuning. We conduct experiments on 11 popular time series datasets: (1) Electricity Transformer Temperature [42] (ETTh(1,2),ETTm1) 3consists of 2 year electric power data collected from two separated counties of China. Each data point includes an "oil temperature" value and 6 power load features. The data is aggregated into 5-minutes windows, resulting in 12 points per hour and 288 points per day. A.1 Electricity Transformer Temperature (ETT) For data pre-processing, we perform zero-mean normalization, i.e., X We use Mean Absolute Errors (MAE) [17] and Mean Squared Errors (MSE) [26] for model comparison.

artificial intelligence, dataset, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.29)

Industry: